multi-label active learning
A Gaussian Process-Bayesian Bernoulli Mixture Model for Multi-Label Active Learning
Multi-label classification (MLC) allows complex dependencies among labels, making it more suitable to model many real-world problems. However, data annotation for training MLC models becomes much more labor-intensive due to the correlated (hence non-exclusive) labels and a potentially large and sparse label space. We propose to conduct multi-label active learning (ML-AL) through a novel integrated Gaussian Process-Bayesian Bernoulli Mixture model (GP-B$^2$M) to accurately quantify a data sample's overall contribution to a correlated label space and choose the most informative samples for cost-effective annotation. In particular, the B$^2$M encodes label correlations using a Bayesian Bernoulli mixture of label clusters, where each mixture component corresponds to a global pattern of label correlations. To tackle highly sparse labels under AL, the B$^2$M is further integrated with a predictive GP to connect data features as an effective inductive bias and achieve a feature-component-label mapping.
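The core idea of modeling label correlations with a Bernoulli mixture can be illustrated with a plain (non-Bayesian) mixture fit by EM, where each component's mean vector captures one global pattern of co-occurring labels. This is a minimal sketch only; the paper's actual B$^2$M is fully Bayesian and jointly trained with a GP over features, neither of which is reproduced here.

```python
import numpy as np

def fit_bernoulli_mixture(Y, K, n_iter=50, seed=0, eps=1e-9):
    """EM for a K-component Bernoulli mixture over binary label matrix Y (N x L).
    Each component mean mu[k] encodes one global label co-occurrence pattern.
    (Illustrative stand-in for the Bayesian B^2M described in the abstract.)"""
    rng = np.random.default_rng(seed)
    N, L = Y.shape
    pi = np.full(K, 1.0 / K)                    # mixing weights
    mu = rng.uniform(0.25, 0.75, size=(K, L))   # per-component Bernoulli params
    for _ in range(n_iter):
        # E-step: log responsibilities, log r[n,k] ∝ log pi_k + sum_l Bernoulli log-lik
        log_p = Y @ np.log(mu + eps).T + (1 - Y) @ np.log(1 - mu + eps).T
        log_r = np.log(pi + eps) + log_p
        log_r -= log_r.max(axis=1, keepdims=True)
        r = np.exp(log_r)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: reweight mixing proportions and component means
        Nk = r.sum(axis=0)
        pi = Nk / N
        mu = (r.T @ Y) / (Nk[:, None] + eps)
    return pi, mu

# Toy label space with two clear co-occurrence patterns
Y = np.array([[1, 1, 0, 0]] * 10 + [[0, 0, 1, 1]] * 10)
pi, mu = fit_bernoulli_mixture(Y, K=2)
```

With this toy data, each recovered component mean concentrates on one of the two label clusters, which is the kind of "global pattern of label correlations" a mixture component represents.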
Multi-Label Bayesian Active Learning with Inter-Label Relationships
Qi, Yuanyuan, Lu, Jueqing, Yang, Xiaohao, Enticott, Joanne, Du, Lan
The primary challenge of multi-label active learning, which distinguishes it from multi-class active learning, lies in assessing the informativeness of an indefinite number of labels while also accounting for the inherent label correlations. Existing studies either require substantial computational resources to leverage correlations or fail to fully explore label dependencies. Additionally, real-world scenarios often require addressing intrinsic biases stemming from imbalanced data distributions. In this paper, we propose a new multi-label active learning strategy to address both challenges. Our method incorporates progressively updated positive and negative correlation matrices to capture co-occurrence and disjoint relationships within the label space of annotated samples, enabling a holistic assessment of uncertainty rather than treating labels as isolated elements. Furthermore, alongside diversity, our model employs ensemble pseudo labeling and beta scoring rules to address data imbalances. Extensive experiments on four realistic datasets demonstrate that our strategy consistently achieves more reliable and superior performance, compared to several established methods.
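One simple way to build the positive and negative correlation matrices the abstract describes is from co-occurrence counts over the annotated labels. The sketch below is a hypothetical formulation (conditional co-occurrence frequencies); the paper's progressive update scheme is not reproduced.

```python
import numpy as np

def label_correlations(Y):
    """Positive/negative correlation matrices from a binary label matrix Y (N x L).
    pos[i, j] estimates P(label j present | label i present);
    neg[i, j] estimates P(label j absent | label i present).
    (Hypothetical formulation for illustration only.)"""
    Y = np.asarray(Y, dtype=float)
    counts = Y.sum(axis=0)        # how often each label is annotated positive
    co = Y.T @ Y                  # pairwise co-occurrence counts
    with np.errstate(divide="ignore", invalid="ignore"):
        pos = np.where(counts[:, None] > 0, co / counts[:, None], 0.0)
    neg = np.where(counts[:, None] > 0, 1.0 - pos, 0.0)
    return pos, neg

# Four annotated samples over three labels
Y = [[1, 1, 0], [1, 0, 1], [0, 1, 1], [1, 1, 0]]
pos, neg = label_correlations(Y)
```

Here `pos[0, 1]` is 2/3 because label 1 co-occurs with label 0 in two of the three samples where label 0 is present; such matrices let an acquisition function score a sample's uncertainty over the label space jointly rather than label by label.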
Multi-Label Active Learning: Query Type Matters
Huang, Sheng-Jun (Nanjing University of Aeronautics and Astronautics) | Chen, Songcan (Nanjing University of Aeronautics and Astronautics) | Zhou, Zhi-Hua (Nanjing University)
Active learning reduces the labeling cost by selectively querying the most valuable information from the annotator. It is especially important for multi-label learning, where the labeling cost is rather high because each object may be associated with multiple labels. Existing multi-label active learning (MLAL) research mainly focuses on the task of selecting instances to be queried. In this paper, we disclose for the first time that the query type, which decides what information to query for the selected instance, is more important. Based on this observation, we propose a novel MLAL framework to query the relevance ordering of label pairs, which gets richer information from each query and requires less expertise of the annotator. By incorporating a simple selection strategy and a label ranking model into our framework, the proposed approach can reduce the labeling effort of annotators significantly. Experiments on 20 benchmark datasets and a manually labeled real-world dataset validate that our approach not only achieves superior performance on classification, but also provides accurate ranking for relevant labels.
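To make the pairwise query type concrete, one natural heuristic is to ask the annotator about the label pair whose relevance ordering the current model is least sure of, i.e. the pair with the smallest gap in predicted relevance scores. This is an illustrative heuristic only; the paper's actual selection strategy may differ.

```python
from itertools import combinations

def most_uncertain_label_pair(probs):
    """Given predicted relevance scores for each label of one instance,
    return the index pair whose relative ordering is least certain
    (smallest absolute score gap). Illustrative heuristic only."""
    best_pair, best_gap = None, float("inf")
    for i, j in combinations(range(len(probs)), 2):
        gap = abs(probs[i] - probs[j])
        if gap < best_gap:
            best_pair, best_gap = (i, j), gap
    return best_pair

# Labels 0 and 2 have nearly equal predicted relevance, so their
# ordering is the most informative pairwise question to ask.
pair = most_uncertain_label_pair([0.9, 0.2, 0.85, 0.1])  # → (0, 2)
```

The annotator's answer ("label 0 is more relevant than label 2", or vice versa) then feeds a label ranking model, which is how a single cheap pairwise query can carry richer information than a full per-label annotation.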